GPU-acceleration for Large-scale Tree Boosting

نویسندگان

  • Huan Zhang
  • Si Si
  • Cho-Jui Hsieh
چکیده

In this paper, we present a novel massively parallel algorithm for accelerating the decision tree building procedure on GPUs (Graphics Processing Units), which is a crucial step in Gradient Boosted Decision Tree (GBDT) and random forests training. Previous GPU based tree building algorithms are based on parallel multiscan or radix sort to find the exact tree split, and thus suffer from scalability and performance issues. We show that using a histogram based algorithm to approximately find the best split is more efficient and scalable on GPU. By identifying the difference between classical GPU-based image histogram construction and the feature histogram construction in decision tree training, we develop a fast feature histogram building kernel on GPU with carefully designed computational and memory access sequence to reduce atomic update conflict and maximize GPU utilization. Our algorithm can be used as a drop-in replacement for histogram construction in popular tree boosting systems to improve their scalability. As an example, to train GBDT on epsilon dataset, our method using a main-stream GPU is 7-8 times faster than histogram based algorithm on CPU in LightGBM and 25 times faster than the exact-split finding algorithm in XGBoost on a dual-socket 28-core Xeon server, while achieving similar prediction accuracy.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast Rcs Prediction Using Multiresolution Shooting and Bouncing Ray Method on the Gpu

This paper presents a GPU-based multiresolution shooting and bouncing ray (MSBR) method with the kd-tree acceleration structure for the fast radar cross section (RCS) prediction of electrically large and complex targets. The multiresolution grid algorithm can greatly reduce the total number of ray tubes, as it adaptively adjusts the density of ray tubes for regions with different complexities o...

متن کامل

Accelerating the XGBoost algorithm using GPU computing

We present a CUDA-based implementation of a decision tree construction algorithm within the gradient boosting library XGBoost. The tree construction algorithm is executed entirely on the graphics processing unit (GPU) and shows high performance with a variety of datasets and settings, including sparse input matrices. Individual boosting iterations are parallelised, combining two approaches. An ...

متن کامل

XGBoost: Reliable Large-scale Tree Boosting System

Tree boosting is an important type of machine learning algorithms that is widely used in practice. In this paper, we describe XGBoost, a reliable, distributed machine learning system to scale up tree boosting algorithms. The system is optimized for fast parallel tree construction, and designed to be fault tolerant under the distributed setting. XGBoost can handle tens of millions of samples on ...

متن کامل

IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES An exact algorithm for energy efficient acceleration of task trees on CPU/GPU architectures

We consider the problem of energy-efficient acceleration of applications comprising multiple interdependent tasks forming a dependency tree, on a hypothetical CPU/GPU system where both a CPU and a GPU can be powered off when idle. Each task in the tree can be invoked on both a GPU or a CPU, but the performance may vary: some run faster on a GPU, others prefer a CPU, making the choice of the low...

متن کامل

Large Scale Terrain Real-Time Rendering on GPU Using Double Layers Tile Quad Tree and Cuboids Bounding Error Metric

Improving terrain tile data selection efficiency, real-time loading of visible tile data and building GPU-based continuous Level of Details (LOD) are the key technologies for large scale terrain rendering on GPU. In this article, in order to reduce terrain tile data selection time, we build double layers tile quad tree for massive terrain data and organize tile data by designing Z-order space f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1706.08359  شماره 

صفحات  -

تاریخ انتشار 2017